Efficient multivariate entropy estimation via $k$-nearest neighbour distances
Similar resources
Efficient multivariate entropy estimation via k-nearest neighbour distances
Many statistical procedures, including goodness-of-fit tests and methods for independent component analysis, rely critically on the estimation of the entropy of a distribution. In this paper, we seek entropy estimators that are efficient in the sense of achieving the local asymptotic minimax lower bound. To this end, we initially study a generalisation of the estimator originally proposed by Ko...
Divergence estimation for multidimensional densities via k-nearest-neighbor distances
A new universal estimator of divergence is presented for multidimensional continuous densities based on k-nearest-neighbor (k-NN) distances. Assuming independent and identically distributed (i.i.d.) samples, the new estimator is proved to be asymptotically unbiased and mean-square consistent. In experiments with high-dimensional data, the k-NN approach generally exhibits faster convergence than p...
k-Nearest Neighbour Classifiers
Perhaps the most straightforward classifier in the arsenal of machine learning techniques is the Nearest Neighbour Classifier – classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance today because issues of poor run-time performance are not such...
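The classification rule described in the abstract above can be illustrated with a minimal sketch. The function name `knn_classify`, the Euclidean distance, the majority vote, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training examples.

    `train` is a list of (feature_vector, label) pairs. Euclidean distance and
    an unweighted vote are illustrative choices, not the paper's prescription.
    """
    # Sort training examples by distance to the query and keep the k closest.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels among those neighbours; on a tie, the label seen first wins.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
train = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"),
]
print(knn_classify(train, (0.2, 0.1), k=3))
```

Sorting the whole training set costs O(n log n) per query, which is acceptable for a sketch; a spatial index would be the usual choice for larger data.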
A non-parametric k-nearest neighbour entropy estimator
A non-parametric k-nearest neighbour based entropy estimator is proposed. It improves on the classical Kozachenko-Leonenko estimator by considering non-uniform probability densities in the region of k-nearest neighbours around each sample point. It aims at improving the classical estimators in three situations: first, when the dimensionality of the random variable is large; second, when near-fu...
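For orientation, the classical Kozachenko-Leonenko estimator that the abstract above refers to can be sketched as follows. This uses one common form of the estimator, log(n - 1) - psi(k) + log V_d + (d/n) * sum_i log rho_i, where rho_i is the Euclidean distance from point i to its k-th nearest neighbour and V_d is the volume of the unit ball in d dimensions. The function names, the brute-force neighbour search, and the assumption that no two sample points coincide are illustrative choices, not details from any of the listed papers.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(k):
    """Digamma function at a positive integer: psi(k) = -gamma + sum_{j=1}^{k-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, k))


def kl_entropy(points, k=1):
    """Kozachenko-Leonenko entropy estimate (in nats) from a list of d-dimensional tuples.

    Assumes a continuous distribution, so that all k-th neighbour distances are positive.
    """
    n = len(points)
    d = len(points[0])
    # log of the unit-ball volume: V_d = pi^(d/2) / Gamma(1 + d/2)
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(1 + d / 2)
    total = 0.0
    for i, x in enumerate(points):
        # Brute-force search: distances from x to every other point, in increasing order.
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        total += math.log(dists[k - 1])  # distance to the k-th nearest neighbour
    return math.log(n - 1) - digamma_int(k) + log_unit_ball + d * total / n


# Hypothetical check on a standard normal sample, whose true entropy is 0.5 * log(2 * pi * e).
random.seed(0)
sample = [(random.gauss(0.0, 1.0),) for _ in range(800)]
print(kl_entropy(sample, k=3), 0.5 * math.log(2 * math.pi * math.e))
```

The brute-force search is quadratic in the sample size; it is kept here only to make the formula easy to follow.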
Efficient model selection for probabilistic K nearest neighbour classification
Probabilistic K-nearest neighbour (PKNN) classification has been introduced to improve the performance of the original K-nearest neighbour (KNN) classification algorithm by explicitly modelling uncertainty in the classification of each feature vector. However, an issue common to both KNN and PKNN is to select the optimal number of neighbours, K. The contribution of this paper is to incorporate t...
Journal
Journal title: The Annals of Statistics
Year: 2019
ISSN: 0090-5364
DOI: 10.1214/18-aos1688